Fix: single Ctrl+C shutdown hang after image generation #112

Merged
lstein merged 1 commit into lstein/feature/elegant-shutdown from copilot/fix-keyboard-interrupt-issue
Mar 2, 2026

Copilot AI commented Mar 2, 2026

Summary

After any image generation, pressing Ctrl+C once caused an indefinite hang requiring a second Ctrl+C to exit, producing a KeyboardInterrupt inside threading._shutdown(). Additionally, pressing Ctrl+C during an active generation produced either an ugly RuntimeError: Event loop is closed stack trace or a C++ std::terminate() core dump.

Root cause: asyncio.to_thread() in session_queue_sqlite.py runs SQLite operations during generation using the asyncio event loop's default ThreadPoolExecutor. In Python 3.12, ThreadPoolExecutor threads have daemon=False. When KeyboardInterrupt interrupts loop.run_until_complete(server.serve()), the loop exits without being shut down — those non-daemon executor threads persist and block threading._shutdown() indefinitely. This is generation-specific because asyncio.to_thread() is never called until a queue item is processed.
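
The mechanism described above can be reproduced outside InvokeAI. The following is a minimal sketch, not the project's code: `blocking_work` is a hypothetical stand-in for the SQLite call, and `lingering_worker_threads` is an illustrative helper. It shows that one `asyncio.to_thread()` call leaves an idle non-daemon executor worker behind, and that draining the default executor removes it.

```python
import asyncio
import threading


def blocking_work() -> str:
    # Hypothetical stand-in for a blocking SQLite operation.
    return "done"


# Threads that already exist before the experiment are ignored.
baseline = set(threading.enumerate())


def lingering_worker_threads() -> list[threading.Thread]:
    """Return live non-daemon threads that were created after the baseline."""
    return [t for t in threading.enumerate() if t not in baseline and not t.daemon]


loop = asyncio.new_event_loop()
result = loop.run_until_complete(asyncio.to_thread(blocking_work))

# The default executor keeps its idle worker alive after the call returns.
before = lingering_worker_threads()

# Draining the executor joins its workers before the loop is closed.
loop.run_until_complete(loop.shutdown_default_executor())
loop.close()
after = lingering_worker_threads()

print(result, [t.name for t in before], [t.name for t in after])
```

If the two shutdown lines are omitted, the worker in `before` is still alive when the interpreter starts tearing down, which is the thread that `threading._shutdown()` waits on.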

invokeai/app/run_app.py

  • Cancels all pending asyncio tasks (e.g. socket.io ping tasks) before closing the loop, preventing "Task was destroyed but it is pending!" warnings on shutdown.
  • After ApiDependencies.shutdown(), calls loop.run_until_complete(loop.shutdown_default_executor()) then loop.close() to drain and terminate the executor's non-daemon threads before Python teardown begins.
  • Adds a post-shutdown warning loop that logs (with stack traces) any non-daemon threads still alive after the executor is shut down — a safety net for unexpected threads from third-party libraries.
except KeyboardInterrupt:
    ApiDependencies.shutdown()
    # Cancel pending tasks (e.g. socket.io pings) to avoid destroyed-task warnings
    pending = [t for t in asyncio.all_tasks(loop) if not t.done()]
    for task in pending:
        task.cancel()
    if pending:
        loop.run_until_complete(asyncio.gather(*pending, return_exceptions=True))
    # Drains non-daemon ThreadPoolExecutor threads created by asyncio.to_thread()
    loop.run_until_complete(loop.shutdown_default_executor())
    loop.close()
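
The post-shutdown warning loop mentioned in the third bullet is not shown in the excerpt above. One way such a check could look is sketched below, assuming it relies on `threading.enumerate()` and `sys._current_frames()`; the function name `describe_lingering_threads` and the worker thread are hypothetical and only serve the demonstration.

```python
import sys
import threading
import traceback


def describe_lingering_threads() -> list[str]:
    """Report each live non-daemon thread (other than main) with its stack."""
    frames = sys._current_frames()  # thread ident -> that thread's current frame
    reports = []
    for thread in threading.enumerate():
        if thread is threading.main_thread() or thread.daemon:
            continue
        frame = frames.get(thread.ident)
        stack = "".join(traceback.format_stack(frame)) if frame else "<no frame available>\n"
        reports.append(f"Non-daemon thread {thread.name!r} is still alive:\n{stack}")
    return reports


# Demonstration with a hypothetical worker that blocks until released.
release = threading.Event()
worker = threading.Thread(target=release.wait, name="hypothetical-worker")
worker.start()

reports = describe_lingering_threads()
for report in reports:
    print(report)

release.set()  # let the worker finish so this script itself exits cleanly
worker.join()
```

In a real shutdown path the reports would presumably go to the application logger at warning level rather than to stdout.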

invokeai/app/services/events/events_fastapievents.py

  • Added a loop.is_closed() guard in FastAPIEventService.dispatch(). When Ctrl+C is pressed mid-generation, the shutdown code closes the event loop while the generation thread is still winding down and trying to emit status events. Events are silently dropped when the loop is already closed, preventing a RuntimeError: Event loop is closed stack trace.
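
A minimal sketch of such a guard follows. `EventServiceSketch` and its `dropped` counter are hypothetical; only the `self._loop.call_soon_threadsafe(self._queue.put_nowait, event)` call mirrors what the traceback in this thread shows for `dispatch()`.

```python
import asyncio


class EventServiceSketch:
    """Illustrative stand-in for an event service that forwards events to a loop."""

    def __init__(self, loop: asyncio.AbstractEventLoop) -> None:
        self._loop = loop
        self._queue: asyncio.Queue = asyncio.Queue()
        self.dropped = 0  # hypothetical counter, for demonstration only

    def dispatch(self, event: object) -> None:
        # During shutdown the loop may already be closed; drop the event
        # instead of letting call_soon_threadsafe() raise RuntimeError.
        if self._loop.is_closed():
            self.dropped += 1
            return
        self._loop.call_soon_threadsafe(self._queue.put_nowait, event)


loop = asyncio.new_event_loop()
service = EventServiceSketch(loop)

service.dispatch("status: in_progress")
loop.run_until_complete(asyncio.sleep(0))  # let the scheduled callback run
queued = service._queue.qsize()

loop.close()
service.dispatch("status: failed")  # silently dropped, no exception

print(queued, service.dropped)
```

A check-then-call guard like this narrows the window but does not remove it entirely: the loop can still be closed between `is_closed()` and `call_soon_threadsafe()`, so wrapping the call in `try/except RuntimeError` would be a stricter alternative.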

invokeai/app/services/session_processor/session_processor_default.py

  • DefaultSessionProcessor.stop() now sets _cancel_event in addition to _stop_event. This signals any in-progress generation (e.g. denoising) to stop at the next step boundary via CanceledException, rather than running to completion. Without this, the generation thread could still be executing CUDA operations while Python teardown begins, causing a C++ std::terminate() core dump (terminate called without an active exception).
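
The step-boundary cancellation described above can be sketched with two `threading.Event` objects. Everything here is illustrative: `ProcessorSketch`, `generate`, and the `time.sleep` stand-in for a denoising step are assumptions, and the local `CanceledException` class only borrows the name used in the PR.

```python
import threading
import time


class CanceledException(Exception):
    """Raised at a step boundary once cancellation has been requested."""


class ProcessorSketch:
    def __init__(self) -> None:
        self._stop_event = threading.Event()
        self._cancel_event = threading.Event()
        self.first_step_done = threading.Event()
        self.steps_completed = 0
        self.canceled = False

    def _step_callback(self) -> None:
        # Called once per step; this is the only point where work can be interrupted.
        if self._cancel_event.is_set():
            raise CanceledException

    def generate(self, total_steps: int) -> None:
        try:
            for _ in range(total_steps):
                self._step_callback()
                time.sleep(0.01)  # stand-in for one denoising step
                self.steps_completed += 1
                self.first_step_done.set()
        except CanceledException:
            self.canceled = True

    def stop(self) -> None:
        # Setting only _stop_event would let the current generation run to completion.
        self._stop_event.set()
        self._cancel_event.set()


processor = ProcessorSketch()
worker = threading.Thread(target=processor.generate, args=(1000,))
worker.start()

processor.first_step_done.wait(timeout=5)
processor.stop()
worker.join(timeout=5)

print(processor.canceled, processor.steps_completed)
```

The point of the design is that the generation thread exits on its own before interpreter teardown starts, rather than being abandoned mid-step.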

tests/test_asyncio_shutdown.py

Four isolated subprocess tests:

  1. Confirms asyncio.to_thread() leaves a non-daemon thread alive (reproduces the bug).
  2. Confirms shutdown_default_executor() + loop.close() eliminates it.
  3. Confirms the fix works when applied after a simulated KeyboardInterrupt.
  4. Confirms that cancelling pending tasks before loop.close() suppresses the "Task was destroyed but it is pending!" warnings.
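
The test file itself is not reproduced in this thread. As a rough sketch of what an isolated subprocess test of this kind might look like, the helper below runs a script in a fresh interpreter with a timeout, so that a shutdown hang surfaces as `subprocess.TimeoutExpired` instead of blocking the test run. `SCRIPT` and `run_isolated` are hypothetical names, and the script simulates Ctrl+C by raising `KeyboardInterrupt` from inside the coroutine.

```python
import subprocess
import sys
import textwrap

# Child script: trigger an executor worker, simulate Ctrl+C, apply the fix,
# then print how many non-daemon threads besides main are still alive.
SCRIPT = textwrap.dedent(
    """
    import asyncio
    import threading

    async def main():
        await asyncio.to_thread(lambda: None)  # spawns an executor worker
        raise KeyboardInterrupt  # simulated Ctrl+C

    loop = asyncio.new_event_loop()
    try:
        loop.run_until_complete(main())
    except KeyboardInterrupt:
        loop.run_until_complete(loop.shutdown_default_executor())
        loop.close()

    alive = [
        t for t in threading.enumerate()
        if t is not threading.main_thread() and not t.daemon
    ]
    print(len(alive))
    """
)


def run_isolated(script: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run a script in a fresh interpreter; a hang raises TimeoutExpired."""
    return subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


completed = run_isolated(SCRIPT)
print(completed.returncode, completed.stdout.strip())
```

Running each scenario in its own interpreter keeps leftover threads from one case from leaking into the next, which matters when the behavior under test is interpreter teardown itself.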

Related Issues / Discussions

QA Instructions

  1. Start the server and generate at least one image.
  2. Open a browser tab connected via socket.io (the normal UI).
  3. Press Ctrl+C once — both after a generation completes and while a generation is actively in progress (e.g. during denoising).
  4. The server should exit cleanly: no hanging, no "Task was destroyed but it is pending!" warnings, no RuntimeError: Event loop is closed stack trace, and no core dump.

Merge Plan

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)
Original prompt

This section details on the original issue you should resolve

<issue_title>[bug]: After generation, two keyboard interrupts needed to shut down server</issue_title>
<issue_description>### Is there an existing issue for this problem?

  • I have searched the existing issues

Install method

Invoke's Launcher

Operating system

Linux

GPU vendor

Nvidia (CUDA)

GPU model

No response

GPU VRAM

No response

Version number

lstein/feature/elegant-shutdown branch

Browser

No response

System Information

No response

What happened

The branch lstein/feature/elegant-shutdown is supposed to make it possible to shut down the server by pressing the ^C (keyboard interrupt) key just once, rather than requiring two presses.

Currently, after any generation has occurred, pressing ^C gives this message followed by an indefinite hang:

^C[2026-03-02 08:56:20,522]::[ModelInstallService]::INFO --> Installer thread 132929897526976 exiting
[2026-03-02 08:56:20,536]::[InvokeAI]::INFO --> InvokeAI shutting down...

Pressing ^C again gives a short stack trace and the server finally does exit:

^CException ignored in: <module 'threading' from '/home/lstein/.local/share/uv/python/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/threading.py'>
Traceback (most recent call last):
  File "/home/lstein/.local/share/uv/python/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/threading.py", line 1624, in _shutdown
    lock.acquire()
KeyboardInterrupt: 

It is likely that there is a thread that is waiting on a lock. Please track this down and fix it.

What you expected to happen

One ^C should be sufficient to shut the server down cleanly.

How to reproduce the problem

No response

Additional context

No response

Discord username

No response</issue_description>

Comments on the Issue (you are @copilot in this section)


Copilot AI changed the title [WIP] Fix keyboard interrupt handling for server shutdown Fix: single Ctrl+C shutdown hang after image generation Mar 2, 2026
Owner

@lstein lstein left a comment

Shutdown no longer hangs, but I'm getting a series of warnings at the end:

^C[2026-03-02 09:40:26,173]::[ModelInstallService]::INFO --> Installer thread 139743997077184 exiting
[2026-03-02 09:40:27,177]::[InvokeAI]::INFO --> InvokeAI shutting down...
Task was destroyed but it is pending!
task: <Task pending name='Task-472' coro=<AsyncSocket._send_ping() running at /home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/engineio/async_socket.py:135> wait_for=<Future pending cb=[Task.task_wakeup()]>>
Task was destroyed but it is pending!
task: <Task pending name='Task-845' coro=<AsyncSocket._send_ping() running at /home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/engineio/async_socket.py:135> wait_for=<Future pending cb=[Task.task_wakeup()]>>
Task was destroyed but it is pending!
task: <Task pending name='Task-847' coro=<AsyncSocket._send_ping() running at /home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/engineio/async_socket.py:135> wait_for=<Future pending cb=[Task.task_wakeup()]>>

Author

Copilot AI commented Mar 2, 2026

Shutdown no longer hangs, but I'm getting a series of warnings at the end:

Fixed in 01a83c5. The warnings were caused by loop.close() destroying socket.io ping tasks that hadn't been cancelled yet. The fix cancels all pending asyncio tasks and waits for them to finish before calling shutdown_default_executor() and loop.close().

Owner

@lstein lstein left a comment

Great improvement. However, when I generate a keyboard interrupt when a generation is in process (e.g. during the denoising process), I get an ugly stack trace:


Traceback (most recent call last):
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/session_processor/session_processor_default.py", line 459, in _process
    self.session_runner.run(queue_item=self._queue_item)
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/session_processor/session_processor_default.py", line 98, in run
    self.run_node(invocation, queue_item)
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/session_processor/session_processor_default.py", line 151, in run_node
    self._on_node_error(
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/session_processor/session_processor_default.py", line 288, in _on_node_error
    queue_item = self._services.session_queue.fail_queue_item(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/session_queue/session_queue_sqlite.py", line 393, in fail_queue_item
    queue_item = self._set_queue_item_status(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/session_queue/session_queue_sqlite.py", line 265, in _set_queue_item_status
    self.__invoker.services.events.emit_queue_item_status_changed(queue_item, batch_status, queue_status)
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/events/events_base.py", line 101, in emit_queue_item_status_changed
    self.dispatch(QueueItemStatusChangedEvent.build(queue_item, batch_status, queue_status))
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/events/events_fastapievents.py", line 31, in dispatch
    self._loop.call_soon_threadsafe(self._queue.put_nowait, event)
  File "/home/lstein/.local/share/uv/python/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/asyncio/base_events.py", line 844, in call_soon_threadsafe
    self._check_closed()
  File "/home/lstein/.local/share/uv/python/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/asyncio/base_events.py", line 545, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed

Could the session processor catch the "Event loop is closed" error and issue a friendlier message, such as "Event loop terminated; generation session aborted"?

Author

Copilot AI commented Mar 2, 2026

Great improvement. However, when I generate a keyboard interrupt when a generation is in process (e.g. during the denoising process), I get an ugly stack trace:

Fixed in fe31ac4. Added a loop.is_closed() guard in FastAPIEventService.dispatch() — when the event loop has been closed during shutdown, events are silently dropped instead of raising RuntimeError: Event loop is closed and cascading through the session processor's error handling. The generation thread now winds down cleanly with no stack trace.

Owner

@lstein lstein left a comment

The last commit results in a core dump when the user interrupts a generation in mid-flight:

[2026-03-02 10:01:09,996]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model '5e0fc394-89d3-4bce-bb7e-782413fede38:scheduler' (DDPMScheduler) onto cuda device in 0.00s. Total model size: 0.00MB, VRAM: 0.00MB (0.0%)
 20%|█████████████████████████████████▌                                                                                                                                      | 6/30 [00:02<00:08,  2.79it/s]
 23%|███████████████████████████████████████▏                                                                                                                                | 7/30 [00:02<00:08,  2.80it/s]
[2026-03-02 10:01:12,858]::[ModelInstallService]::INFO --> Installer thread 125658999027392 exiting
[2026-03-02 10:01:12,860]::[InvokeAI]::INFO --> InvokeAI shutting down...
 27%|████████████████████████████████████████████▊                                                                                                                           | 8/30 [00:02<00:07,  2.80it/s]
terminate called without an active exception
Aborted (core dumped)

Author

Copilot AI commented Mar 2, 2026

The last commit results in a core dump when the user interrupts a generation in mid-flight:

Fixed in aaaf3fe. The core dump was caused by a race between CUDA denoising and Python teardown: our dispatch() guard silently dropped events instead of throwing RuntimeError, which inadvertently removed the exception cascade that was previously (and accidentally) stopping the generation thread. With the loop closed and no exception to interrupt it, the generation kept running CUDA ops while Python tore down — terminate called without an active exception.

The fix sets _cancel_event in DefaultSessionProcessor.stop(), so when shutdown is triggered the denoising step callback sees the cancel at the next step boundary and raises CanceledException. The generation thread stops its CUDA work cleanly before Python teardown begins, eliminating the core dump.

@lstein lstein marked this pull request as ready for review March 2, 2026 15:13
@lstein lstein force-pushed the copilot/fix-keyboard-interrupt-issue branch from aaaf3fe to bc509eb Compare March 2, 2026 15:17
…generation hang

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

Fix: cancel pending asyncio tasks before loop.close() to suppress destroyed-task warnings

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

Fix: suppress stack trace when dispatching events after event loop is closed on shutdown

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

Fix: cancel in-progress generation on stop() to prevent core dump during mid-flight Ctrl+C

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
@lstein lstein force-pushed the copilot/fix-keyboard-interrupt-issue branch from 1bf5121 to 5b536c7 Compare March 2, 2026 16:12
@lstein lstein merged commit 448eb34 into lstein/feature/elegant-shutdown Mar 2, 2026
13 checks passed
@lstein lstein deleted the copilot/fix-keyboard-interrupt-issue branch March 2, 2026 16:31
lstein added a commit that referenced this pull request Mar 6, 2026
…i#8936)

* Fix: Kill the server with one keyboard interrupt (#94)

* Initial plan

* Handle KeyboardInterrupt in run_app to allow single Ctrl+C shutdown

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

* Force os._exit(0) on KeyboardInterrupt to avoid hanging on background threads

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

Fix graceful shutdown to wait for download/install worker threads (#102)

* Initial plan

* Replace os._exit(0) with ApiDependencies.shutdown() on KeyboardInterrupt

Instead of immediately force-exiting the process on CTRL+C, call
ApiDependencies.shutdown() to gracefully stop the download and install
manager services, allowing active work to complete or cancel cleanly
before the process exits.

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

* Make stop() idempotent in download and model install services

When CTRL+C is pressed, uvicorn's graceful shutdown triggers the FastAPI
lifespan which calls ApiDependencies.shutdown(), then a KeyboardInterrupt
propagates from run_until_complete() hitting the except block which tries
to call ApiDependencies.shutdown() a second time.

Change both stop() methods to return silently (instead of raising) when
the service is not running. This handles:
- Double-shutdown: lifespan already stopped the services
- Early interrupt: services were never fully started

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

Fix shutdown hang on session processor thread lock (#108)

* Initial plan

* Fix shutdown hang: wake session processor thread on stop() and mark daemon

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

* Fix: shut down asyncio executor on KeyboardInterrupt to prevent post-generation hang (#112)

Fix: cancel pending asyncio tasks before loop.close() to suppress destroyed-task warnings
Fix: suppress stack trace when dispatching events after event loop is closed on shutdown
Fix: cancel in-progress generation on stop() to prevent core dump during mid-flight Ctrl+C

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>